Sample queue depth via observable gauge #81

Merged: loks0n merged 2 commits into main from debug-failed-queue on Jun 1, 2026

@loks0n loks0n commented Jun 1, 2026

Summary

Migrates the messaging.queue.depth metric from a synchronous gauge — recorded imperatively on the message hot path — to an observable gauge sampled by the telemetry SDK at each collection interval.

Previously the depth was recorded via Gauge::record() once at worker start and again in the finally block after every processed message. That coupled the sampling cadence to throughput:

  • Idle / stuck queue → stale metric. If jobs stop flowing (all failing, no healthy consumer), record() never fires and the gauge goes stale exactly when it matters most.
  • High throughput → redundant work. A Redis listSize round-trip ran on the hot path for every message.

An observable gauge fixes both: the SDK invokes the registered observe() callback on a fixed export cadence, decoupled from message processing, and uses last-value semantics.
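
The difference between the two models can be sketched in a few lines. This is an illustration only: SketchGauge, SketchObservableGauge, and collect() are hypothetical stand-ins written for this example, not the utopia-php/telemetry classes, and the queue depth is simulated with a plain variable.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a synchronous gauge: it keeps the last pushed value.
final class SketchGauge
{
    public ?int $last = null;

    public function record(int $value): void
    {
        $this->last = $value;
    }
}

// Hypothetical stand-in for an observable gauge: it keeps callbacks, not values.
final class SketchObservableGauge
{
    /** @var list<callable> */
    private array $callbacks = [];

    public function observe(callable $callback): void
    {
        $this->callbacks[] = $callback;
    }

    // Stands in for the SDK running one collection interval.
    public function collect(): ?int
    {
        $last = null;
        foreach ($this->callbacks as $callback) {
            $callback(function (int $value) use (&$last): void {
                $last = $value; // last-value semantics
            });
        }

        return $last;
    }
}

$depth = 5; // simulated queue depth

$push = new SketchGauge();
$pull = new SketchObservableGauge();
$pull->observe(function (callable $observer) use (&$depth): void {
    $observer($depth); // read the depth only when asked
});

$push->record($depth); // push model: recorded while handling a message
$depth = 40;           // backlog grows, but no further message is processed

$pushValue = $push->last;      // still the value from the last processed message
$pullValue = $pull->collect(); // sampled at collection time
```

In this sketch the pushed value stays at 5 after the backlog grows, while the pulled value reflects the current depth of 40, which is the stale-metric case described above.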

Changes

  • composer.json: bump utopia-php/telemetry from 0.2.* to 0.4.* (adds createObservableGauge).
  • src/Queue/Server.php: swap Gauge/createGauge for ObservableGauge/createObservableGauge; register an observe() callback that reads getQueueSize() and reports it with the same attributes (messaging.destination.name/namespace) and the same Publisher-check and try/catch guards. Remove recordQueueDepth() and both call sites.
  • tests/.../ServerTelemetryTest.php: update to the pull model by asserting against $telemetry->observableGauges and invoking the registered callback via a collectObservations() helper (one invocation = one collection cycle).
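
A minimal sketch of the callback shape and the test helper described in the list above, under stated assumptions: the Publisher interface, SketchObservableGauge, registerQueueDepth(), and this collectObservations() are hypothetical stand-ins and do not reproduce the code in src/Queue/Server.php or the real telemetry API. Only the messaging.destination.name attribute is shown, since the exact key of the namespace attribute is not spelled out here.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in; the real interface in the repository may differ.
interface Publisher
{
    public function getQueueSize(string $queue): int;
}

// Hypothetical gauge that only stores the registered callbacks.
final class SketchObservableGauge
{
    /** @var list<callable> */
    public array $callbacks = [];

    public function observe(callable $callback): void
    {
        $this->callbacks[] = $callback;
    }
}

// Registers the depth callback. Nothing is read from the backend here;
// getQueueSize() runs only when the callback is invoked.
function registerQueueDepth(SketchObservableGauge $gauge, object $consumer, string $queue): void
{
    $gauge->observe(function (callable $observer) use ($consumer, $queue): void {
        if (!$consumer instanceof Publisher) {
            return; // guard: this adapter cannot report a size
        }
        try {
            $observer($consumer->getQueueSize($queue), [
                'messaging.destination.name' => $queue,
            ]);
        } catch (\Throwable $e) {
            // guard: a failing size lookup must not break the collection cycle
        }
    });
}

// One call models one collection cycle and returns what was observed.
function collectObservations(SketchObservableGauge $gauge): array
{
    $observations = [];
    foreach ($gauge->callbacks as $callback) {
        $callback(function (int $value, array $attributes = []) use (&$observations): void {
            $observations[] = ['value' => $value, 'attributes' => $attributes];
        });
    }

    return $observations;
}

$healthy = new class implements Publisher {
    public function getQueueSize(string $queue): int
    {
        return 7;
    }
};
$failing = new class implements Publisher {
    public function getQueueSize(string $queue): int
    {
        throw new \RuntimeException('backend unavailable');
    }
};
$plain = new \stdClass(); // not a Publisher

$results = [];
foreach (['healthy' => $healthy, 'failing' => $failing, 'plain' => $plain] as $name => $consumer) {
    $gauge = new SketchObservableGauge();
    registerQueueDepth($gauge, $consumer, 'emails');
    $results[$name] = collectObservations($gauge);
}
```

The three consumers mirror the three cases a pull-model test would exercise: a publisher that reports a size yields one observation, while a non-publisher and a throwing publisher yield none.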

Not covered here

This does not by itself fix the Grafana double-counting on the dashboard — that's a fan-in problem (N workers/pods each reporting the same shared Redis depth under identical attributes), which needs the dashboard query to max/avg across series rather than sum(). The gauge also still only samples the pending queue, not the failed queue. Both are tracked as follow-ups.
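
The fan-in effect is plain arithmetic and can be shown with made-up numbers. The depth of 120 and the worker count of 4 below are hypothetical values chosen for illustration, not measurements from this project.

```php
<?php

declare(strict_types=1);

$sharedDepth = 120; // hypothetical depth of the single shared Redis list
$workers = 4;       // hypothetical number of workers/pods reporting it

// Each worker exports its own series with the same value and attributes.
$series = array_fill(0, $workers, $sharedDepth);

$summed = array_sum($series);                    // inflated by the worker count
$maxed = max($series);                           // the shared depth itself
$averaged = array_sum($series) / count($series); // also the shared depth here
```

With these numbers a sum reports 480 while max and average both report 120. In practice workers sample at slightly different moments, so max and average may differ a little from each other, but neither scales with the number of workers the way a sum does.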

Test plan

  • composer run test (ServerTelemetryTest 3/3 pass)
  • composer run check (phpstan clean, --memory-limit=512M)
  • composer run lint (pint passed)

🤖 Generated with Claude Code

loks0n and others added 2 commits June 1, 2026 14:05
Migrate messaging.queue.depth from a synchronous gauge recorded on the
message hot path to an observable gauge sampled by the telemetry SDK at
each collection interval. The depth now stays fresh even when the queue
is idle or stuck, instead of only being re-recorded once per processed
message, and the per-message getQueueSize() call leaves the hot path.

Requires utopia-php/telemetry 0.4.* for createObservableGauge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

Migrates messaging.queue.depth from a synchronous push-model Gauge (recorded on the message hot path) to a pull-model ObservableGauge whose callback is invoked by the telemetry SDK at each collection interval, decoupling queue-depth sampling from message throughput.

  • Server.php: Drops recordQueueDepth() and its two call sites; registers an observe() callback inside setTelemetry() that reads getQueueSize() only when invoked by the SDK, with the same Publisher guard and error-suppression logic as before.
  • tests/ServerTelemetryTest.php: Replaces the $telemetry->gauges assertions with $telemetry->observableGauges and a collectObservations() helper that manually drives one collection cycle per call, correctly modelling the pull-model semantics.

Confidence Score: 5/5

Safe to merge — the refactor is self-contained, all three guard paths are tested, and no hot-path logic is affected.

The change correctly moves queue-depth sampling out of the message hot path into a pull-model callback. The Publisher check, attribute set, and silent error handling are all preserved unchanged. Tests drive the new pull model correctly and cover the three relevant cases (publisher with sizes, non-publisher, and throwing publisher). No regressions are visible in the changed files.

No files require special attention.

Important Files Changed

Filename | Overview
src/Queue/Server.php | Removes recordQueueDepth() and both hot-path call sites; registers an observe() closure on the new ObservableGauge, preserving the Publisher guard, attributes, and silent error handling.
tests/Queue/E2E/Adapter/ServerTelemetryTest.php | Tests updated to the pull model; collectObservations() helper drives each cycle independently, correctly exercising the stateful mock consumer and all three guard paths.
composer.json | Bumps utopia-php/telemetry constraint from 0.2.* to 0.4.* to gain createObservableGauge.
composer.lock | Lock file updated to telemetry 0.4.0 plus minor bumps to several Symfony packages; no unexpected dependency changes.

Reviews (1): Last reviewed commit: "Drop explanatory comments"

@loks0n loks0n merged commit 9866251 into main Jun 1, 2026
8 checks passed